New Approach to Frequency Dictionaries - Czech Example
Abstract
Using the recent edition of the Frequency Dictionary of Czech as an example, we describe and explain some new general principles that should be followed to obtain better results in practical uses of frequency dictionaries. The main principle is to order items by average reduced frequency instead of absolute frequency. The contribution presents the formula for calculating the average reduced frequency together with a brief explanation, including examples that clarify the difference between the two measures. The Frequency Dictionary of Czech and its parts are then described.
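The abstract does not reproduce the formula itself. As a rough illustration, the sketch below uses the definition of average reduced frequency (ARF) commonly given in the literature, which is an assumption here rather than a quotation from the paper: for a word with f occurrences in a corpus of N tokens, let v = N / f and let d_i be the distances between consecutive occurrences, with the first distance wrapping around from the last occurrence to the first; then ARF = (1 / v) * sum of min(d_i, v). The function name, parameter names, and the toy positions are hypothetical.

```python
def average_reduced_frequency(positions, corpus_size):
    """Sketch of ARF for one word (assumed definition, see the text above).

    positions   -- token offsets (0-based) at which the word occurs
    corpus_size -- total number of tokens N in the corpus
    """
    f = len(positions)
    if f == 0:
        return 0.0
    positions = sorted(positions)
    v = corpus_size / f  # average distance if occurrences were evenly spread

    # Cyclic distances: the first one wraps from the last occurrence
    # back around to the first, so the distances always sum to N.
    distances = [positions[0] + corpus_size - positions[-1]]
    distances += [b - a for a, b in zip(positions, positions[1:])]

    # Each distance contributes at most v, so tightly clustered
    # occurrences add little to the total.
    return sum(min(d, v) for d in distances) / v


# Two hypothetical words, each with absolute frequency 4 in a 100-token corpus.
evenly_spread = average_reduced_frequency([0, 25, 50, 75], 100)
clustered = average_reduced_frequency([10, 11, 12, 13], 100)
print(evenly_spread)  # 4.0
print(clustered)      # 1.12
```

Both toy words have the same absolute frequency, but the clustered one receives an ARF close to 1, which is why ordering by ARF pushes words concentrated in a single text lower in the list than words spread across the whole corpus.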
Similar resources
Homonymy and Polysemy in the Czech Morphological Dictionary
We focus on the problem of homonymy and polysemy in morphological dictionaries, using the Czech morphological dictionary MorfFlex CZ [2] as an example. It is not necessary to distinguish meanings in morphological dictionaries unless the distinction has consequences in word formation or syntax. The contribution proposes several important rules and principles for achieving consistency.
Genetic Algorithms in Syllable-Based Text Compression
Syllable based text compression is a new approach to compression by symbols. In this concept syllables are used as the compression symbols instead of the more common characters or words. This new technique has proven itself worthy especially on short to middle-length text files. The effectiveness of the compression is greatly affected by the quality of dictionaries of syllables characteristic f...
Behaviour of the Czech Suffix -ák - A Case Study
New techniques in Czech derivational morphology are discussed. They are based on the exploitation of the tool Deriv with integrated access to the main Czech dictionaries and corpora (SYN2000c and the new large Czech corpus CzTenTen12). The case study deals especially with the Czech suffix -ák – we describe its behaviour as completely as possible. The paper brings some new results in comparison ...
Czech MWE Database
In this paper we deal with a recently developed large Czech MWE database containing at the moment 160 000 MWEs (treated as lexical units). It was compiled from various resources such as encyclopedias and dictionaries, public databases of proper names and toponyms, collocations obtained from Czech WordNet, lists of botanical and zoological terms and others. We describe the structure of the datab...
The Core of the Czech Derivational Dictionary
Among the available language resources for the Czech language one can find many useful dictionaries, databases and corpora. There are machine readable dictionaries of literary Czech (Havránek, 1989; Filipec, 1998), the dictionary of Czech synonyms (Pala, 2000) and two encyclopaedias: Otto and Diderot. Moreover, Czech researchers have two morphological databases (Hajič, 2001; Sedláček and S...